ABSTRACT: This paper develops a new approach to robust
specification testing for dynamic econometric models. A novel
feature of these tests is that, in addition to the estimation under
the null hypothesis, computation requires only a matrix linear least
squares regression and then an ordinary least squares regression
similar to those employed in popular nonrobust tests. The
statistics proposed here are robust to departures from
distributional assumptions that are not being tested. Moreover, the
statistics may be computed using any $\sqrt{T}$-consistent estimator.
Several examples are presented to illustrate the generality of the
procedure. Among these are conditional mean tests for models
estimated by weighted nonlinear least squares which do not require
correct specification of the conditional variance, and tests of
conditional means and variances estimated by quasi-maximum
likelihood under nonnormality. Also, some new, computationally
simple tests for the tobit model are proposed.


1.  Introduction

Specification testing has become an integral part of the
econometric model building process. The literature is extensive,
and model diagnostics are available for most procedures used by
applied econometricians. By far the most popular specification
tests are those that can be computed using ordinary least squares
regressions. Examples are the Lagrange Multiplier (LM) test,
versions of Hausman's [8] specification tests, White's [14]
information matrix (IM) test, and an LM version of the
Davidson-MacKinnon [3] test for nonnested hypotheses. In fact,
Newey [10] and White [16] have shown that most of these tests are
asymptotically equivalent to one of the conditional moment (CM)
tests considered by Newey [10], Tauchen [11], and White [16]. In
the maximum likelihood setting with independent observations, Newey
[10] has shown how to compute CM tests using auxiliary regressions.
White [16] has extended Newey's results to a general dynamic
setting.

The simplicity of the regression-based procedures currently
used is not without cost. In many cases the validity of these tests
relies on certain auxiliary assumptions holding in addition to the
relevant null hypothesis. For example, in a nonlinear regression
framework where the dynamic regression function is correctly
specified under the null hypothesis, the usual LM regression-based
statistic is invalid in the presence of conditional
heteroskedasticity. The regression form that falls out of Newey
[10] or White [16] is also usually invalid. Other examples are the
various tests for heteroskedasticity: currently used regression
forms require constancy of the conditional fourth moment of the
regression errors under the null hypothesis. Finally, LM and other
CM tests for jointly parameterized conditional means and variances
are inappropriate under nonnormality. All of these situations are
characterized by the same feature: validity of the tests requires
imposition of more than just the hypotheses of interest under $H_0$.
Furthermore, traditional testing procedures require that the
estimators used to compute the statistics are efficient in some
sense under the null hypothesis. It is important to stress that
this is not merely nitpicking about regularity conditions.

Due primarily to the work of White [12,13,14,15], Domowitz and
White [4], Hansen [6], and Newey [10], there now exist general
methods of computing robust statistics. Unfortunately, for general
classes of specification tests, computing robust versions using
currently available methods is burdensome. This is particularly
true of LM-like tests where, at least based on currently available
formulas, analytically solving for the derivative of the implicit
constraint function and computing generalized inverses are needed
for computation. Several authors have even claimed that, contrary
to the case of the Wald statistic, there are no useful robust forms
of the LM statistic.

It is a safe bet that the substantial analytical and
computational work required to obtain robust statistics is the
reason they appear relatively infrequently in applied econometric
work. Evidence of this statement is the growing use of the White
[12] heteroskedasticity-robust t-statistics, which are now computed
by many econometrics packages. In the same papers one rarely sees
an LM test, a Hausman test, or a nonnested hypothesis test carried
out in a manner that is robust to second moment misspecification.
This is unfortunate since these tests are inconsistent for the
alternative that the conditional mean is correctly specified but the
conditional variance has been misspecified. In other words, the
standard forms of well known tests can result in inference with the
wrong asymptotic size while having no systematic power for testing
the auxiliary assumptions that are imposed in addition to $H_0$.

This paper develops a unified approach to calculating robust
statistics which I believe is easily accessible to applied
econometricians. It is shown that a general class of tests can be
obtained using only linear least squares regressions. These tests
maintain only the hypotheses of interest under the null, and are
applicable to specification testing of dynamic multivariate
conditional means and/or conditional variances without imposing
further assumptions on the conditional distribution (except
regularity conditions). In classical situations, these tests are
asymptotically equivalent to their traditional counterparts under
the additional assumptions needed to make the standard tests valid.
Moreover, because the statistics may be computed using any
$\sqrt{T}$-consistent estimator, the methodology leads to some interesting
new tests in cases where the computational burden based on previous
approaches is prohibitive.

The remainder of the paper is organized as follows. Section 2
discusses the setup and the general results. Section 3 illustrates
the scope of the methodology with several examples, and Section 4
contains concluding remarks. Regularity conditions and proofs are
contained in an appendix.

2.  General  Results 


Let $\{(Y_t, Z_t): t = 1, 2, \ldots\}$ be a sequence of observable random
vectors with $Y_t$ $1 \times J$ and $Z_t$ $1 \times K$. $Y_t$ is the vector of
endogenous variables. Interest lies in explaining $Y_t$ in terms of the
explanatory variables $Z_t$ and (in a time series context) past values
of $Y_t$ and $Z_t$. For time series applications, let
$X_t \equiv (Z_t, Y_{t-1}, Z_{t-1}, \ldots, Y_1, Z_1)$ denote the predetermined
variables and let $\mathcal{X}_t$, a subset of Euclidean space, denote the
support of $X_t$. For cross section applications, set $X_t \equiv Z_t$.

The conditional distribution of $Y_t$ given $X_t = x_t$ always exists
and is denoted $D_t(\cdot \mid x_t)$. Assume that the researcher is interested
in testing hypotheses about a certain aspect of $D_t$, for example the
conditional expectation and/or the conditional variance. Note that,
because at time $t$ the conditioning set contains
$\{(Y_{t-1}, Z_{t-1}), \ldots, (Y_1, Z_1)\}$, the assumption is that interest
lies in getting the dynamics of the relevant aspects of $D_t$ correctly
specified. For cross section applications, this point is of course
irrelevant.

Many specification tests, including those for conditional means
and variances, have asymptotically equivalent versions that can be
derived as follows. Let $\eta_t(Y_t, X_t, \theta)$ be an $L \times 1$ random function
defined on a parameter set $\Theta \subset \mathbb{R}^P$, and let $\varphi_t(X_t, \theta)$ be an
$L \times 1$ function also defined on $\Theta$. Note that $\eta_t$ depends on $Y_t$
whereas $\varphi_t$ depends only on the predetermined variables. The null
hypothesis of interest is expressed as

$$H_0:\ E[\eta_t(Y_t, X_t, \theta_0) \mid X_t] = \varphi_t(X_t, \theta_0),\ \text{for some } \theta_0 \in \Theta,\ t = 1, 2, \ldots \tag{2.1}$$

The leading case, and the one emphasized in this paper, is when
$\{\varphi_t(x_t, \theta): x_t \in \mathcal{X}_t, \theta \in \Theta\}$ is a parameterized family for the
conditional mean and/or conditional variance of $Y_t$ given $X_t = x_t$.

The validity of (2.1) can be tested by choosing functions of the
predetermined variables $X_t$ and checking whether the sample covariances
between these functions and

$$\psi_t(Y_t, X_t, \theta) \equiv \eta_t(Y_t, X_t, \theta) - \varphi_t(X_t, \theta)$$

are significantly different from zero. It is useful to allow the
indicators to depend on $\theta$ and some nuisance parameters. Let $\pi \in \Pi$
denote an $N \times 1$ vector of nuisance parameters, and let
$\delta \equiv (\theta', \pi')'$ be the $M \times 1$ vector of all parameters where
$M = P + N$. Let $\Lambda_t(X_t, \delta)$ be an $L \times Q$ matrix and let $C_t(X_t, \delta)$ be
an $L \times L$, symmetric and positive semi-definite matrix. Assume the
availability of an estimator $\hat\theta_T$ such that
$T^{1/2}(\hat\theta_T - \theta_0) = O_p(1)$ under $H_0$. Also assume that the nuisance
parameter estimator $\hat\pi_T$ is such that $T^{1/2}(\hat\pi_T - \pi_T^o) = O_p(1)$,
where $\{\pi_T^o: T = 1, 2, \ldots\}$ is a nonstochastic sequence in $\Pi$. Then a
computable test statistic is the $Q \times 1$ vector

$$T^{-1}\sum_{t=1}^{T} \hat\Lambda_t' \hat C_t \hat\psi_t = T^{-1}\sum_{t=1}^{T} \hat\Lambda_t' \hat C_t (\hat\eta_t - \hat\varphi_t) \tag{2.2}$$

where "^" denotes that each function is evaluated at $\hat\theta_T$ or
$\hat\delta_T \equiv (\hat\theta_T', \hat\pi_T')'$ (note that the dependence of the summands
in (2.2) on the sample size $T$ is suppressed). From a theoretical
standpoint, the p.s.d. matrix $C_t$ could of course be absorbed into
$\Lambda_t$, and $\varphi_t$ could be absorbed into $\eta_t$, but the structure in (2.2)
is exploited below to generate regression-based tests with the
additional property that they are asymptotically equivalent to
standard tests under classical circumstances.

To use (2.2) as a basis for a test of (2.1), the limiting
distribution of

$$\hat\xi_T \equiv T^{-1/2}\sum_{t=1}^{T} \hat\Lambda_t' \hat C_t \hat\psi_t \tag{2.3}$$

under $H_0$ is needed. In general, finding the asymptotic distribution
of $\hat\xi_T$ entails finding the limiting distribution of

$$\xi_T^o \equiv T^{-1/2}\sum_{t=1}^{T} \Lambda_t^{o\prime} C_t^o \psi_t^o \tag{2.4}$$

(values with "o" superscripts are evaluated at $\theta_0$ or
$\delta_T^o \equiv (\theta_0', \pi_T^{o\prime})'$) and the limiting distribution of
$T^{1/2}(\hat\theta_T - \theta_0)$ (the limiting distribution of
$T^{1/2}(\hat\pi_T - \pi_T^o)$ does not affect the limiting distribution of
$\hat\xi_T$ under $H_0$). Because $\xi_T^o$ is the standardized sum of a vector
martingale difference sequence under $H_0$, its limiting distribution
is frequently derivable from a central limit theorem. In standard
cases $T^{1/2}(\hat\theta_T - \theta_0)$ will also be asymptotically normal. Given
the asymptotic covariance matrices of $\xi_T^o$ and $T^{1/2}(\hat\theta_T - \theta_0)$ and
differentiability assumptions on $\Lambda_t$, $C_t$, and $\varphi_t$, it is possible to
derive the asymptotic covariance matrix of $\hat\xi_T$ by the usual mean
value expansion. In principle, deriving a quadratic form in $\hat\xi_T$
which has an asymptotic $\chi^2$ distribution is straightforward. But
nothing guarantees that the resulting test statistic is easy to
compute.

In specific instances test statistics based on $\hat\xi_T$ can be
computed from simple OLS regressions. For example, Newey [10] and
White [16] have shown how statistics based on covariances of the
form (2.2) can be computed from simple auxiliary regressions when
$\hat\theta_T$ is the maximum likelihood estimator and the conditional density
is correctly specified under $H_0$.

In general, the regression-based statistics appearing in the
literature have the drawback that they are not robust to certain
departures from distributional assumptions. For example, suppose
interest lies in testing hypotheses about the conditional
expectation of $Y_t$ (taken to be a scalar for simplicity) given $X_t$.
The parametric model is

$$\{m_t(x_t, \theta):\ x_t \in \mathcal{X}_t,\ \theta \in \Theta\}, \tag{2.5}$$

where $\Theta \subset \mathbb{R}^P$, and the null hypothesis is

$$H_0:\ E(Y_t \mid X_t) = m_t(X_t, \theta_0),\ \text{some } \theta_0 \in \Theta,\ t = 1, 2, \ldots \tag{2.6}$$

Setting $L = 1$, $C_t(X_t, \delta) = 1$, $\eta_t(Y_t, X_t, \theta) = Y_t$, and
$\varphi_t(X_t, \theta) = m_t(X_t, \theta)$ in (2.1) yields a class of tests based on

$$T^{-1}\sum_{t=1}^{T} \lambda_t(X_t, \hat\delta_T)' \hat U_t \tag{2.7}$$

where $\hat U_t \equiv Y_t - m_t(X_t, \hat\theta_T)$, $\hat\theta_T$ is the nonlinear least squares
(NLLS) estimator, $\lambda_t(X_t, \delta)$ is a $1 \times Q$ vector function of
misspecification indicators, and $\delta$ is a vector containing $\theta$ and
possibly other nuisance parameters. The standard LM approach leads
to a test based on the (uncentered) $R_u^2$ from the regression

    $\hat U_t$  on  $\nabla_\theta \hat m_t$, $\hat\lambda_t$,    $t = 1, \ldots, T$.    (2.8)

Under $H_0$ and conditional homoskedasticity, $TR_u^2$ is asymptotically
$\chi^2_Q$. Thus, the LM approach effectively takes the null hypothesis
to be

$$H_0':\ H_0 \text{ holds and } V(Y_t \mid X_t) = \sigma_0^2 \text{ for some } \sigma_0^2 > 0,\ t = 1, 2, \ldots \tag{2.9}$$

but it is of course an inconsistent test for the alternative

    $H_1'$: $H_0$ holds but $H_0'$ does not.

The regression form from Newey [10] and White [16] is

    1  on  $\hat U_t \nabla_\theta \hat m_t$, $\hat U_t \hat\lambda_t$,    $t = 1, \ldots, T$.    (2.10)

In general, $H_0'$ is also required for $TR_u^2$ from this regression to be
asymptotically $\chi^2_Q$.
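
The two nonrobust regression forms (2.8) and (2.10) can be sketched in a few lines. This is a minimal illustration, not part of the paper: the function names `uncentered_tr2`, `lm_nonrobust`, and `lm_outer_product`, and the array layout (residuals of length T, gradient T x P, indicators T x Q) are assumptions made here. The uncentered r-squared divides the fitted sum of squares by the raw sum of squares of the dependent variable, with no demeaning.

```python
import numpy as np

def uncentered_tr2(y, X):
    """T times the uncentered R-squared from OLS of y on X (no intercept added)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coef
    return len(y) * (fitted @ fitted) / (y @ y)

def lm_nonrobust(u, grad_m, lam):
    """TR_u^2 from regression (2.8): residuals on the gradient and the indicators."""
    return uncentered_tr2(u, np.column_stack([grad_m, lam]))

def lm_outer_product(u, grad_m, lam):
    """TR_u^2 from regression (2.10): 1 on residual-weighted gradient and indicators."""
    Z = np.column_stack([grad_m, lam]) * u[:, None]
    return uncentered_tr2(np.ones(len(u)), Z)
```

Both statistics would be compared with a chi-square critical value with Q degrees of freedom, and, as the text stresses, both lean on the auxiliary homoskedasticity assumption in (2.9).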

There are many other examples where the goal is to test
hypotheses about certain aspects of a distribution but auxiliary
assumptions are maintained under the null hypothesis in order to
obtain a simple regression-based test. Because the limiting
distributions of test statistics can be sensitive to violations of
the auxiliary assumptions, it is important to use robust forms of
tests for which $H_0$ includes only the hypotheses of interest. But as
mentioned above, applying the standard mean-value approach to the
general statistic $\hat\xi_T$ results in a statistic for which computation
can be prohibitively burdensome.

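A relatively simple statistic is available if $\hat\xi_T$ is
appropriately modified. Assume that $\theta_0 \in \mathrm{int}(\Theta)$ and that $\varphi_t$ is
differentiable on $\mathrm{int}(\Theta)$. Then, instead of using the indicator
$\hat\Lambda_t' \hat C_t$, the idea is to first purge from $\hat C_t^{1/2}\hat\Lambda_t$ its linear
projection onto $\hat C_t^{1/2}\nabla_\theta\hat\varphi_t$. That is, consider the modified
statistic

$$\tilde\xi_T \equiv T^{-1/2}\sum_{t=1}^{T} \bigl[\hat C_t^{1/2}\hat\Lambda_t - \hat C_t^{1/2}\nabla_\theta\hat\varphi_t \hat B_T\bigr]' \hat C_t^{1/2}\hat\psi_t \tag{2.11}$$

where

$$\hat B_T \equiv \Bigl[\sum_{t=1}^{T} \nabla_\theta\hat\varphi_t' \hat C_t \nabla_\theta\hat\varphi_t\Bigr]^{-1} \sum_{t=1}^{T} \nabla_\theta\hat\varphi_t' \hat C_t \hat\Lambda_t \tag{2.12}$$

is the $P \times Q$ matrix of regression coefficients from the regression

    $\hat C_t^{1/2}\hat\Lambda_t$  on  $\hat C_t^{1/2}\nabla_\theta\hat\varphi_t$,    $t = 1, \ldots, T$.    (2.13)

Equation (2.11) can be written more succinctly as

$$\tilde\xi_T \equiv T^{-1/2}\sum_{t=1}^{T} \ddot\Lambda_t' \tilde\psi_t \tag{2.14}$$

where $\{\ddot\Lambda_t: t = 1, \ldots, T\}$ are the residuals from the regression in
(2.13) and $\tilde\psi_t \equiv \hat C_t^{1/2}\hat\psi_t$.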


It is important to note that $\tilde\xi_T$ and $\hat\xi_T$ are not always
asymptotically equivalent in the sense that $\tilde\xi_T - \hat\xi_T \to 0$ in
probability under $H_0$. In general, the indicators $\hat\Lambda_t'\hat C_t$ and
$[\hat\Lambda_t - \nabla_\theta\hat\varphi_t\hat B_T]'\hat C_t$ are useful for checking different
departures from (2.1). I return to this issue below.

Even when $\tilde\xi_T$ and $\hat\xi_T$ are not asymptotically equivalent, $\tilde\xi_T$ can
be used as the basis for a useful specification test. The
computational simplicity of a limiting $\chi^2$ quadratic form in $\tilde\xi_T$ is
a consequence of the following theorem.

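Theorem 2.1: Assume that the following conditions hold under $H_0$:

(i) Regularity conditions A.1 in the appendix;

(ii) For some $\theta_0 \in \mathrm{int}(\Theta)$,

    (a) $E[\eta_t(Y_t, X_t, \theta_0) \mid X_t] = \varphi_t(X_t, \theta_0)$, $t = 1, 2, \ldots$;

    (b) $E[\nabla_\theta \eta_t(Y_t, X_t, \theta_0) \mid X_t] = 0$, $t = 1, 2, \ldots$;

    (c) $T^{1/2}(\hat\delta_T - \delta_T^o) = O_p(1)$.

Then

$$\tilde\xi_T = T^{-1/2}\sum_{t=1}^{T} \bigl[\Lambda_t^o - \nabla_\theta\varphi_t^o B_T^o\bigr]' C_t^o \psi_t^o + o_p(1)$$

where

$$B_T^o \equiv \Bigl[\sum_{t=1}^{T} E\bigl(\nabla_\theta\varphi_t^{o\prime} C_t^o \nabla_\theta\varphi_t^o\bigr)\Bigr]^{-1} \sum_{t=1}^{T} E\bigl(\nabla_\theta\varphi_t^{o\prime} C_t^o \Lambda_t^o\bigr).$$

In addition,

$$TR_u^2 \xrightarrow{d} \chi^2_Q,$$

where $R_u^2$ is the uncentered r-squared from the regression

    1  on  $\hat\psi_t' \hat C_t [\hat\Lambda_t - \nabla_\theta\hat\varphi_t \hat B_T]$,    $t = 1, \ldots, T$    (2.15)

and $\hat B_T$ is given by (2.12).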


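Theorem 2.1 can be applied as follows:

(1) Given $\Lambda_t$, $C_t$, $\eta_t$, $\varphi_t$, and $\hat\delta_T$, compute $\hat\Lambda_t$, $\hat C_t$, $\hat\eta_t$,
$\hat\varphi_t$, and $\nabla_\theta\hat\varphi_t$. Define $\tilde\Lambda_t \equiv \hat C_t^{1/2}\hat\Lambda_t$,
$\nabla_\theta\tilde\varphi_t \equiv \hat C_t^{1/2}\nabla_\theta\hat\varphi_t$, and $\tilde\psi_t \equiv \hat C_t^{1/2}(\hat\eta_t - \hat\varphi_t)$;

(2) Run the matrix regression

    $\tilde\Lambda_t$  on  $\nabla_\theta\tilde\varphi_t$,    $t = 1, \ldots, T$    (2.16)

and save the residuals, say $\ddot\Lambda_t$;

(3) Run the regression

    1  on  $\tilde\psi_t' \ddot\Lambda_t$,    $t = 1, \ldots, T$

and use $TR_u^2$ as asymptotically $\chi^2_Q$ under $H_0$, assuming that $\Lambda_t$ does
not contain redundant indicators.  ■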
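
Steps (1)-(3) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's own code: the function name `robust_cm_test` and the array layout (indicators T x L x Q, weights T x L x L, gradients T x L x P, and the differences between the fitted eta and phi as T x L) are choices made here, and the symmetric square root of each weighting matrix is taken by eigendecomposition. The matrix regression in step (2) is run by stacking the T blocks of L rows.

```python
import numpy as np

def robust_cm_test(Lam, C, grad_phi, psi):
    """Return (TR_u^2, Q) for the robust regression-based test.

    Lam: (T, L, Q) indicators; C: (T, L, L) symmetric p.s.d. weights;
    grad_phi: (T, L, P) gradients of phi_t; psi: (T, L) values of eta_t - phi_t.
    All arrays are assumed to be evaluated at the estimates already.
    """
    T, L, Q = Lam.shape
    # Step (1): premultiply everything by the symmetric square root of C_t.
    w, V = np.linalg.eigh(C)
    root = np.sqrt(np.clip(w, 0.0, None))
    C_half = (V * root[:, None, :]) @ np.swapaxes(V, 1, 2)
    Lam_w = C_half @ Lam
    grad_w = C_half @ grad_phi
    psi_w = (C_half @ psi[:, :, None])[:, :, 0]
    # Step (2): matrix regression of weighted indicators on weighted gradients.
    G = grad_w.reshape(T * L, -1)
    A = Lam_w.reshape(T * L, Q)
    B, *_ = np.linalg.lstsq(G, A, rcond=None)
    resid = (A - G @ B).reshape(T, L, Q)
    # Step (3): regress 1 on the 1 x Q row psi_w_t' resid_t; TR_u^2 = T - SSR.
    R = np.einsum("tl,tlq->tq", psi_w, resid)
    ones = np.ones(T)
    coef, *_ = np.linalg.lstsq(R, ones, rcond=None)
    ssr = np.sum((ones - R @ coef) ** 2)
    return T - ssr, Q
```

Because the dependent variable is a vector of ones, its uncentered total sum of squares is T, so T times the uncentered r-squared equals T minus the sum of squared residuals. The returned statistic would be compared with a chi-square critical value with Q degrees of freedom.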


Note that condition (ii.b) is an additional restriction on $\eta_t$
that must be satisfied in order for (1)-(3) to be a valid procedure
under $H_0$. This assumption rules out certain specification tests,
but is applicable to the leading case of diagnostics for conditional
means (hence conditional probabilities) and/or conditional
variances. These are usually the cases where one would like to be
robust against other distributional departures.

Assumption (ii.c) is perhaps more properly listed as a
regularity condition, but it is placed in the text to emphasize the
generality of Theorem 2.1. Having a $\sqrt{T}$-consistent estimator of $\delta_T^o$
is a fairly weak requirement, and allows relatively simple
specification tests when $\theta_0$ (as well as $\pi_T^o$) has been estimated by an
inefficient procedure. An application to the tobit model is given
in Section 3.

An important issue, mentioned earlier, is the relationship
between $\tilde\xi_T$ and $\hat\xi_T$. There is a simple characterization of their
asymptotic equivalence.

Lemma 2.2: Let the conditions of Theorem 2.1 hold. If, in
addition,

    (iii) $T^{-1/2}\sum_{t=1}^{T} \nabla_\theta\hat\varphi_t' \hat C_t (\hat\eta_t - \hat\varphi_t) = o_p(1)$,

then

$$\tilde\xi_T - \hat\xi_T = o_p(1). \quad ■ \tag{2.17}$$

The importance of this lemma is that if (iii) holds then the
modified indicator is testing for departures from $H_0$ in the same
directions as the originally chosen indicator. When (2.17) holds, the
statistics based on quadratic forms in $\tilde\xi_T$ and $\hat\xi_T$ are asymptotically
equivalent. This is useful when comparing tests derived from Theorem
2.1 to more traditional forms of tests.
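
The role of condition (iii) can be seen from a one-line calculation. The display below is a sketch of the algebra only, not a replacement for the proof in the appendix. It writes the unmodified statistic as the sum of the indicator-weighted differences scaled by $T^{-1/2}$, writes the modified statistic with the projection on the gradient removed as in (2.11), and assumes that the coefficient matrix $\hat B_T$ of (2.12) is $O_p(1)$, which is taken here to follow from the regularity conditions.

```latex
\tilde\xi_T - \hat\xi_T
  = T^{-1/2}\sum_{t=1}^{T}
      \bigl[\hat\Lambda_t - \nabla_\theta\hat\varphi_t\,\hat B_T\bigr]'\hat C_t\hat\psi_t
    \;-\; T^{-1/2}\sum_{t=1}^{T}\hat\Lambda_t'\hat C_t\hat\psi_t
  = -\,\hat B_T'\Bigl[\,T^{-1/2}\sum_{t=1}^{T}
      \nabla_\theta\hat\varphi_t'\hat C_t(\hat\eta_t - \hat\varphi_t)\Bigr]
  = O_p(1)\cdot o_p(1) = o_p(1).
```

In particular, when the bracketed sum is exactly zero, as it is when it coincides with the first-order condition defining the estimator, the two statistics are numerically identical.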

Condition (iii) is usefully interpreted as the sample covariance
between $\{\hat C_t^{1/2}\nabla_\theta\hat\varphi_t: t = 1, \ldots, T\}$ and
$\{\hat C_t^{1/2}\hat\psi_t: t = 1, \ldots, T\}$ being zero. It is trivially satisfied if

$$\sum_{t=1}^{T} \nabla_\theta\varphi_t(\theta)' C_t(\theta, \hat\pi_T)[\eta_t(\theta) - \varphi_t(\theta)] = 0 \tag{2.18}$$

is the defining first-order condition for $\hat\theta_T$. This is frequently the
case, particularly when $\hat\theta_T$ is a quasi-maximum likelihood estimator
(QMLE) of the parameters of a conditional mean (see Wooldridge [19])
or of the conditional mean and conditional variance (see Example 3.3
below). Note that in these cases (2.17) holds (trivially) for local
alternatives. Therefore, the difference between the test based on
$\tilde\xi_T$ and a more traditional nonrobust test based on $\hat\xi_T$ (e.g. an LM
test) is simply that different estimators have been used for the
moment matrix appearing in the quadratic form. Consequently, under
the conditions required for the classical test to be valid, the two
procedures are asymptotically equivalent under local alternatives.
The robust test has the advantage of having a limiting noncentral $\chi^2$
distribution even when the auxiliary assumptions are violated under
local alternatives (e.g. heteroskedasticity is present in a dynamic
regression model).

3.  Examples of Regression-Based, Robust Tests

Example 3.1: Let $Y_t$ be a scalar and let
$\{m_t(x_t, \beta): x_t \in \mathcal{X}_t, \beta \in B\}$, $B \subset \mathbb{R}^P$, be a parametric family
for the conditional expectation of $Y_t$ given $X_t$. The null hypothesis
is

$$H_0:\ E(Y_t \mid X_t) = m_t(X_t, \beta_0),\ \text{some } \beta_0 \in B,\ t = 1, 2, \ldots \tag{3.1}$$

Let $\{c_t(x_t, \alpha): x_t \in \mathcal{X}_t, \alpha \in A\}$ be a sequence of weighting functions
such that $c_t(x_t, \alpha) > 0$, and suppose that $\hat\alpha_T$ is an estimator such
that $T^{1/2}(\hat\alpha_T - \alpha_T^o) = O_p(1)$, where $\{\alpha_T^o\} \subset A$. It is not assumed
that $\{c_t(x_t, \alpha): x_t \in \mathcal{X}_t, \alpha \in A\}$ contains a version of $V(Y_t \mid X_t)$.
The researcher merely chooses the set of weights $\{c_t(x_t, \hat\alpha_T)\}$ and
performs weighted NLLS (WNLLS). The WNLLS estimator $\hat\beta_T$ solves

$$\sum_{t=1}^{T} \nabla_\beta m_t(\beta)' [Y_t - m_t(\beta)] / c_t(\hat\alpha_T) = 0. \tag{3.2}$$

A general class of diagnostics is based on

$$\sum_{t=1}^{T} \lambda_t(\hat\delta_T)' [c_t(\hat\alpha_T)]^{-1} [Y_t - m_t(\hat\beta_T)] \tag{3.3}$$

where $\delta$ can contain $\beta$, $\alpha$ and other nuisance parameters. Letting
$\theta = \beta$, $C_t(\delta) = [c_t(\alpha)]^{-1}$, $\eta_t(\theta) = Y_t$, and $\varphi_t(\theta) = m_t(\beta)$, it is
easy to see that conditions (ii.a) and (ii.b) of Theorem 2.1 hold
under $H_0$. Condition (ii.c) will also usually be satisfied. Because
(iii) of Lemma 2.2 holds, the statistic obtained from Theorem 2.1 is
asymptotically equivalent to the statistic based on (3.3). The
following procedure is valid under $H_0$, without any assumptions about
$V(Y_t \mid X_t)$ (except, of course, regularity conditions):

(i) Estimate $\beta_0$ by WNLLS. Compute the residuals $\hat U_t$, the gradient
$\nabla_\beta m_t(\hat\beta_T)$, and the indicator $\lambda_t(\hat\delta_T)$. Define
$\tilde U_t \equiv \hat c_t^{-1/2}\hat U_t$, $\nabla_\beta\tilde m_t \equiv \hat c_t^{-1/2}\nabla_\beta\hat m_t$, and
$\tilde\lambda_t \equiv \hat c_t^{-1/2}\hat\lambda_t$;

(ii) Regress $\tilde\lambda_t$ on $\nabla_\beta\tilde m_t$ and keep the residuals, say $\ddot\lambda_t$;

(iii) Regress 1 on $\tilde U_t\ddot\lambda_t$ and use $TR_u^2$ from this regression as
asymptotically $\chi^2_Q$ under $H_0$.
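
Steps (i)-(iii) of this example can be sketched as below, taking the WNLLS residuals, gradient, indicators, and fitted weights as given. This is a hedged illustration: the function name `robust_mean_test` and its argument layout (residuals of length T, gradient T x P, indicators T x Q, strictly positive weights of length T) are assumptions made here, and the estimation step itself is not shown.

```python
import numpy as np

def robust_mean_test(u, grad_m, lam, c):
    """TR_u^2 for the heteroskedasticity-robust conditional mean test.

    u: (T,) WNLLS residuals; grad_m: (T, P) gradient of the mean function;
    lam: (T, Q) misspecification indicators; c: (T,) positive weights.
    """
    s = 1.0 / np.sqrt(c)
    # Step (i): weight residuals, gradient, and indicators by c_t^{-1/2}.
    u_w = s * u
    g_w = grad_m * s[:, None]
    l_w = lam * s[:, None]
    # Step (ii): residuals from regressing weighted indicators on weighted gradient.
    B, *_ = np.linalg.lstsq(g_w, l_w, rcond=None)
    l_res = l_w - g_w @ B
    # Step (iii): regress 1 on u_w * l_res; TR_u^2 equals T minus the SSR.
    R = l_res * u_w[:, None]
    ones = np.ones(len(u))
    coef, *_ = np.linalg.lstsq(R, ones, rcond=None)
    return len(u) - np.sum((ones - R @ coef) ** 2)
```

Setting every weight to one gives the robust test for ordinary NLLS. The statistic would be compared with a chi-square critical value with Q degrees of freedom.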

The indicator $\lambda_t$ can be chosen to yield
heteroskedasticity-robust LM tests, Hausman tests based on two WNLLS
regressions which do not assume that either estimator is relatively
efficient, and tests of nonnested hypotheses, such as the
Davidson-MacKinnon [3] test, which are valid in the presence of
heteroskedasticity. These tests are considered in more detail in
Wooldridge [19].

Example 3.2: Suppose now, in the context of Example 3.1, $c_t(\alpha)$ is
set to 1 and the goal is to test the assumption of homoskedasticity
(actually, the goal is to test the joint assumption of correctness
of the conditional mean and homoskedasticity). In particular, the
null hypothesis is

$$H_0:\ E(Y_t \mid X_t) = m_t(X_t, \beta_0),\ V(Y_t \mid X_t) = \sigma_0^2,\ \text{some } \beta_0 \in B,\ \text{some } \sigma_0^2 > 0,\ t = 1, 2, \ldots \tag{3.4}$$

In the notation of Theorem 2.1, $\theta = (\beta', \sigma^2)'$. Let $\hat\beta_T$ be the NLLS
estimator, and let $\hat U_t$ be the NLLS residuals. Let $\lambda_t(X_t, \delta)$ be a
$1 \times Q$ vector of indicators. Most tests for heteroskedasticity are
based on a statistic of the form

$$T^{-1}\sum_{t=1}^{T} \hat\lambda_t' [\hat U_t^2 - \hat\sigma_T^2] \tag{3.5}$$

where $\hat\sigma_T^2$ is the usual estimator $T^{-1}\sum_{t=1}^{T} \hat U_t^2$. Choosing
$\lambda_t(X_t, \delta)$ to be the nonconstant, nonredundant elements of
$\mathrm{vech}[\nabla_\beta m_t(\beta)'\nabla_\beta m_t(\beta)]$ leads to the White [12] test for
heteroskedasticity. Choosing $\lambda_t(X_t, \delta) = X_{t1}$, where $X_{t1}$ is a
$1 \times Q$ subvector of $X_t$, leads to the Lagrange Multiplier test for a
general form of heteroskedasticity (see Breusch and Pagan [1]).
Setting $\lambda_t(X_t, \delta) = (U_{t-1}^2(\beta), \ldots, U_{t-Q}^2(\beta))$ gives Engle's [5]
test for ARCH.

The correspondences for Theorem 2.1 are $L = 1$, $C_t(\delta) = 1$,
$\eta_t(\theta) = U_t^2(\beta)$, and $\varphi_t(\theta) = \sigma^2$. Under $H_0$,
$E[U_t^2(\beta_0) \mid X_t] = \sigma_0^2$ so that (ii.a) of Theorem 2.1 is satisfied.
Also, $\nabla_\beta U_t^2(\beta) = -2\nabla_\beta m_t(\beta) U_t(\beta)$. Under $H_0$,
$E[U_t(\beta_0) \mid X_t] = 0$ so that $E[\nabla_\beta U_t^2(\beta_0) \mid X_t] = 0$ and (ii.b)
holds. In this case, the relevant element of $\nabla_\theta\varphi_t$ is simply 1.
Thus, the auxiliary regression in the second step of the robust
procedure simply demeans the indicators. Given $\hat U_t^2$, $\hat\sigma_T^2$, and a
choice of $\hat\lambda_t$, the $\chi^2_Q$ statistic is obtained as $TR_u^2$ from the
regression

    1  on  $(\hat U_t^2 - \hat\sigma_T^2)(\hat\lambda_t - \bar\lambda_T)$,    $t = 1, \ldots, T$    (3.6)

where $\bar\lambda_T \equiv T^{-1}\sum_{t=1}^{T} \hat\lambda_t$. This procedure is asymptotically
equivalent to the corresponding more traditional forms of the tests
under the additional assumption that $E[U_t^4(\beta_0) \mid X_t]$ is constant
(note that (iii) of Lemma 2.2 is satisfied). Interestingly, the
slight modification in (3.6) (which is the demeaning of the
indicators $\hat\lambda_t$) yields an asymptotically $\chi^2_Q$ distributed statistic
without the additional assumption of constant fourth moment for
$U_t$. In the case of the White test in a linear time series model,
the demeaning of the indicators yields a statistic which is
asymptotically equivalent to Hsieh's [9] suggestion for a robust form
of the White test, but the above statistic is significantly easier
to compute. Rarely does one care to assume anything about the
fourth moment of $Y_t$, so that the robust regression form in (3.6)
seems to be a useful modification.
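
Regression (3.6) needs nothing beyond the squared residuals and the demeaned indicators. The sketch below is illustrative only: the function name `robust_het_test` and its inputs (residuals of length T, indicators T x Q) are assumptions made here, and the residuals are taken as already computed by NLLS.

```python
import numpy as np

def robust_het_test(u, lam):
    """TR_u^2 from regression (3.6): 1 on (u_t^2 - sigma^2) * (lam_t - mean of lam)."""
    u2 = u ** 2
    sig2 = u2.mean()                    # usual variance estimator
    lam_dm = lam - lam.mean(axis=0)     # second-step regression reduces to demeaning
    R = (u2 - sig2)[:, None] * lam_dm
    ones = np.ones(len(u))
    coef, *_ = np.linalg.lstsq(R, ones, rcond=None)
    return len(u) - np.sum((ones - R @ coef) ** 2)
```

Because the indicators are demeaned inside the function, shifting them by a constant leaves the statistic unchanged, which is the only difference from the traditional outer-product form.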

In the case of the ARCH test, $TR_u^2$ from the regression in (3.6)
is asymptotically equivalent to $TR_u^2$ from the regression

    1  on  $(\hat U_t^2 - \hat\sigma_T^2)(\hat U_{t-1}^2 - \hat\sigma_T^2), \ldots, (\hat U_t^2 - \hat\sigma_T^2)(\hat U_{t-Q}^2 - \hat\sigma_T^2)$,    $t = Q+1, \ldots, T$.    (3.7)

The regression-based form in (3.7) is robust to departures from the
conditional normality assumption, and from any other auxiliary
assumptions, such as constant conditional fourth moment for $U_t$.
Contrast this to the usual method of computing tests for ARCH.  ■
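
The ARCH form (3.7) can be sketched the same way. The function name `robust_arch_test` is an assumption made here, and the sketch scales the uncentered r-squared by the T - Q observations actually used in the regression, which does not matter asymptotically.

```python
import numpy as np

def robust_arch_test(u, q):
    """(T - q) * R_u^2 from regression (3.7), using residuals u and q lags."""
    e = u ** 2 - np.mean(u ** 2)        # centered squared residuals
    T = len(u)
    # Column j-1 holds the j-th lag of e, aligned with observations q+1, ..., T.
    lags = np.column_stack([e[q - j: T - j] for j in range(1, q + 1)])
    R = e[q:, None] * lags
    n = T - q
    ones = np.ones(n)
    coef, *_ = np.linalg.lstsq(R, ones, rcond=None)
    return n - np.sum((ones - R @ coef) ** 2)
```

The result would be compared with a chi-square critical value with q degrees of freedom.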

Example 3.3: Theorem 2.1 can also be applied to models that jointly
parameterize the conditional mean and conditional variance. The
general setup is as follows. For simplicity, let $Y_t$ be a scalar,
and consider LM tests which do not assume conditional normality.
The unconstrained conditional mean and variance functions are

$$\{\mu_t(x_t, \gamma),\ \omega_t(x_t, \gamma):\ x_t \in \mathcal{X}_t,\ \gamma \in \Gamma\} \tag{3.8}$$

where $\Gamma \subset \mathbb{R}^M$. It is assumed that

$$E(Y_t \mid X_t) = \mu_t(X_t, \gamma_0),\quad V(Y_t \mid X_t) = \omega_t(X_t, \gamma_0),\quad \text{some } \gamma_0 \in \Gamma. \tag{3.9}$$

Take the null hypothesis to be

$$H_0:\ \gamma_0 = r(\theta_0)\ \text{for some } \theta_0 \in \Theta \subset \mathbb{R}^P \tag{3.10}$$

where $P < M$ and $r$ is continuously differentiable on $\mathrm{int}(\Theta)$. Let
$m_t(\theta) \equiv \mu_t(r(\theta))$ and $w_t(\theta) \equiv \omega_t(r(\theta))$ be the constrained mean and
variance functions. QMLE is carried out under the null hypothesis.
Let $\hat\theta_T$ be the estimator of $\theta_0$ under $H_0$, and let $\hat\gamma_T \equiv r(\hat\theta_T)$ be the
constrained estimator of $\gamma_0$. $\nabla_\theta\hat m_t$ and $\nabla_\theta\hat w_t$ are the $1 \times P$
gradients of $m_t$ and $w_t$ under $H_0$. Note that $\hat\omega_t = \hat w_t$ and
$\hat\mu_t = \hat m_t$ by definition. The LM test of (3.10) is based on the
unrestricted score of the quasi-log likelihood evaluated at $\hat\gamma_T$. The
transpose of the score is

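$$s_t(\gamma)' = \nabla_\gamma\mu_t(\gamma)' U_t(\gamma)/\omega_t(\gamma) + \nabla_\gamma\omega_t(\gamma)' [U_t^2(\gamma) - \omega_t(\gamma)]/2\omega_t^2(\gamma) \tag{3.11}$$

$$\phantom{s_t(\gamma)'} = \begin{bmatrix} \nabla_\gamma\mu_t(\gamma) \\ \nabla_\gamma\omega_t(\gamma) \end{bmatrix}' \begin{bmatrix} 1/\omega_t(\gamma) & 0 \\ 0 & 1/2\omega_t^2(\gamma) \end{bmatrix} \begin{bmatrix} U_t(\gamma) \\ U_t^2(\gamma) - \omega_t(\gamma) \end{bmatrix}. \tag{3.12}$$

Evaluating $s_t$ at $r(\theta)$ gives

$$s_t(r(\theta))' = \Lambda_t(\theta)' C_t(\theta) [\eta_t(\theta) - \varphi_t(\theta)] \tag{3.13}$$

where $\Lambda_t(\theta)' \equiv [\nabla_\gamma\mu_t(r(\theta))' \mid \nabla_\gamma\omega_t(r(\theta))']$, $C_t(\theta)$ is the
diagonal matrix in the middle of (3.12) evaluated at $r(\theta)$,
$\eta_t(\theta)' \equiv [Y_t,\ U_t^2(r(\theta))]$, and $\varphi_t(\theta)' \equiv [m_t(\theta),\ w_t(\theta)]$, so that
$\eta_t(\theta) - \varphi_t(\theta)$ reproduces the last vector in (3.12). The
standardized score evaluated at $\hat\gamma_T$ is

$$T^{-1/2}\sum_{t=1}^{T} \hat s_t' = T^{-1/2}\sum_{t=1}^{T} \hat\Lambda_t' \hat C_t (\hat\eta_t - \hat\varphi_t). \tag{3.14}$$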
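Under $H_0$ and the assumption of conditional normality, $TR_u^2$ from the
regression

    1  on  $\hat s_t$,    $t = 1, \ldots, T$    (3.15)

is asymptotically $\chi^2_Q$, where $Q = M - P$ is the number of restrictions
under $H_0$. Unfortunately, this procedure is invalid under
nonnormality. Theorem 2.1 suggests a robust form of the test. In
this case,

$$\nabla_\theta\hat\varphi_t = \begin{bmatrix} \nabla_\theta\hat m_t \\ \nabla_\theta\hat w_t \end{bmatrix}\ (2 \times P),\quad \hat\Lambda_t = \begin{bmatrix} \nabla_\gamma\hat\mu_t \\ \nabla_\gamma\hat\omega_t \end{bmatrix}\ (2 \times M),\quad \hat C_t = \begin{bmatrix} 1/\hat w_t & 0 \\ 0 & 1/2\hat w_t^2 \end{bmatrix},\quad \hat\psi_t = \begin{bmatrix} \hat U_t \\ \hat U_t^2 - \hat w_t \end{bmatrix}$$

where $\hat U_t \equiv Y_t - m_t(\hat\theta_T)$. The transformed quantities are

$$\nabla_\theta\tilde\varphi_t = \begin{bmatrix} \nabla_\theta\hat m_t/\sqrt{\hat w_t} \\ \nabla_\theta\hat w_t/\hat w_t\sqrt{2} \end{bmatrix},\quad \tilde\Lambda_t = \begin{bmatrix} \nabla_\gamma\hat\mu_t/\sqrt{\hat w_t} \\ \nabla_\gamma\hat\omega_t/\hat w_t\sqrt{2} \end{bmatrix},\quad \tilde\psi_t = \begin{bmatrix} \hat U_t/\sqrt{\hat w_t} \\ [\hat U_t^2 - \hat w_t]/\hat w_t\sqrt{2} \end{bmatrix}. \tag{3.16}$$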
The robust test statistic is obtained by first running the
regression

    $\tilde\Lambda_t$  on  $\nabla_\theta\tilde\varphi_t$,    $t = 1, \ldots, T$    (3.17)

and saving the matrix residuals $\{\ddot\Lambda_t: t = 1, \ldots, T\}$. Then run the
regression

    1  on  $\tilde\psi_t'\ddot\Lambda_t$,    $t = 1, \ldots, T$    (3.18)

and use $TR_u^2$ as asymptotically $\chi^2_Q$ under $H_0$. Note that the regression
in (3.18) contains perfect multicollinearity since
$\ddot\Lambda_t \nabla_\theta r(\hat\theta_T) = 0$, where $\nabla_\theta r(\theta)$ is the $M \times P$ gradient of $r$. Many
regression packages nevertheless compute an $R^2$; for those that do
not, $P$ regressors can be omitted from (3.18).

Note that the first order condition for $\hat\theta_T$ is simply

$$\sum_{t=1}^{T} \nabla_\theta\varphi_t(\hat\theta_T)' C_t(\hat\theta_T) [\eta_t(\hat\theta_T) - \varphi_t(\hat\theta_T)] = 0, \tag{3.19}$$

so that the robust indicator is asymptotically equivalent to the
usual LM indicator. The matrix regression in (3.17) is the cost to
the researcher in guarding against nonnormality.  ■

Example 3.4: Suppose that $Y_t$ is a random scalar censored below zero, and let $X_{t1}$ be a $1\times P$ vector of predetermined variables from $X_t$. A popular model for $Y_t$ is the tobit model. The tobit model implies that

$$E(Y_t|Y_t>0,X_t) = X_{t1}\beta_o + \sigma_o v(X_{t1}\beta_o/\sigma_o)$$   (3.20)

where $v(\cdot)$ is the Mills ratio, $\sigma_o^2$ is the conditional variance usually associated with the "latent" variable, and $X_{t1}\beta_o$ is the conditional mean of the latent variable. From a statistical point of view, the tobit model is no more sensible than

$$\log Y_t \mid Y_t>0, X_t \sim N(X_{t1}\alpha_o, \omega_o^2)$$   (3.21)

((3.21) also seems reasonable for many economic applications). If (3.21) is valid, $\alpha_o$ and $\omega_o^2$ can be estimated by OLS of

$\log Y_t$   on   $X_{t1}$,   t=1,...,T

using only the positive values of $Y_t$. Recall that (3.21) implies

$$E(Y_t|Y_t>0,X_t) = \exp[\omega_o^2/2 + X_{t1}\alpha_o].$$   (3.22)
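
As a rough illustration of how the fitted values implied by (3.22) might be computed, the sketch below runs OLS of log Y on the regressors using the positive observations only. The function name `lognormal_fitted_mean` and the degrees-of-freedom correction in the variance estimate are assumptions of this example, not choices stated in the text.

```python
import numpy as np

def lognormal_fitted_mean(y, X1):
    """OLS of log(y) on X1 over y > 0, then exp(omega2/2 + X1 @ alpha).

    y  : (T,) outcomes, censored below at zero
    X1 : (T, P) regressors
    """
    pos = y > 0
    Xp, logy = X1[pos], np.log(y[pos])
    alpha, *_ = np.linalg.lstsq(Xp, logy, rcond=None)
    resid = logy - Xp @ alpha
    # Residual variance with a degrees-of-freedom correction (assumed).
    omega2 = float(resid @ resid / (pos.sum() - Xp.shape[1]))
    # Fitted conditional mean for every observation.
    lam = np.exp(omega2 / 2.0 + X1 @ alpha)
    return alpha, omega2, lam
```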

Let $\hat\lambda_t = \exp[\hat\omega_T^2/2 + X_{t1}\hat\alpha_T]$ be the fitted values in (3.22). Then, if the tobit model is true, $\hat\lambda_t$ should be statistically insignificant as a regressor in equation (3.20). Let $\hat\beta_T$, $\hat\sigma_T^2$ be any $T^{1/2}$-consistent estimators of $\beta_o$, $\sigma_o^2$ under $H_0$. These include Heckman's [7] two-step estimators. Let

$$\hat U_t = Y_t - X_{t1}\hat\beta_T - \hat\sigma_T v(X_{t1}\hat\beta_T/\hat\sigma_T).$$

A test which should have some power against departures from the tobit model can be based on the correlation between $\hat U_t$ and $\hat\lambda_t$. Unfortunately, the usual LM statistic is invalid for two reasons. First, $V(Y_t|Y_t>0,X_t)$ is not constant, and second, the estimators $(\hat\beta_T,\hat\sigma_T)$ need not have been obtained from a nonlinear least squares problem. Nevertheless, a statistic is available from Theorem 2.1. Let

$$\varphi_t(\beta,\sigma) = X_{t1}\beta + \sigma v(X_{t1}\beta/\sigma)$$

and let $\nabla_\theta\hat\varphi_t$ denote the $1\times(P+1)$ gradient of $\varphi_t$ with respect to $\beta$ and $\sigma$, evaluated at $(\hat\beta_T,\hat\sigma_T)$. Then the following procedure is asymptotically valid:

(i) Run the OLS regression

$\hat\lambda_t$   on   $\nabla_\theta\hat\varphi_t$,   t=1,...,T

and save the residuals $\ddot\lambda_t$.

(ii) Run the regression

1   on   $\hat U_t\ddot\lambda_t$,   t=1,...,T

and use $TR^2$ as asymptotically $\chi^2_1$ under $H_0$.
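
Steps (i) and (ii) can be sketched as below. The sketch assumes that $v(z)$ is the ratio of the standard normal density to its distribution function, so that $v'(z) = -v(z)[z + v(z)]$; the function name `tobit_robust_test` and its argument layout are hypothetical, and all inputs are taken to be restricted to observations with $Y_t > 0$.

```python
import math
import numpy as np

def _norm_pdf(z):
    return np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)

# Standard normal cdf built from math.erf, applied elementwise.
_norm_cdf = np.vectorize(lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))

def tobit_robust_test(y, X1, lam, beta, sigma):
    """T * R^2 from steps (i)-(ii); inputs cover only y > 0 (assumed)."""
    z = X1 @ beta / sigma
    v = _norm_pdf(z) / _norm_cdf(z)      # assumed form of v(.)
    dv = -v * (z + v)                    # derivative of v
    u = y - X1 @ beta - sigma * v        # residuals U_t
    # Gradient of phi_t = X1*beta + sigma*v(X1*beta/sigma):
    # d/dbeta = X1 * (1 + v'(z)),  d/dsigma = v(z) - z * v'(z).
    grad = np.column_stack([X1 * (1.0 + dv)[:, None], v - z * dv])
    # (i) purge lam of the gradient by OLS and keep the residuals.
    b, *_ = np.linalg.lstsq(grad, lam, rcond=None)
    lam_res = lam - grad @ b
    # (ii) regress 1 on w_t = u_t * lam_res_t; with one regressor,
    # T * (uncentered R^2) = (sum w)^2 / sum w^2.
    w = u * lam_res
    return float(w.sum() ** 2 / (w @ w))
```

The weighted variant would divide `lam`, `grad` and `u` by the square root of an estimated conditional variance before the same two steps.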

Note that weighted least squares could also be used, where the weight corresponds to the inverse of $V(Y_t|Y_t>0,X_t)$ under the tobit model. If $\hat c_t$ is an estimate of this variance, replace $\hat\lambda_t$ and $\nabla_\theta\hat\varphi_t$ by $\hat\lambda_t/\sqrt{\hat c_t}$ and $\nabla_\theta\hat\varphi_t/\sqrt{\hat c_t}$, respectively, in (i), and replace $\hat U_t$ by $\hat U_t/\sqrt{\hat c_t}$ in (ii). Although it intuitively makes sense to use the weighted version, it is not possible to say one approach is better than the other without more information about the origins of $\hat\beta_T$ and $\hat\sigma_T$.

One can of course change the roles of the models, and test for a significant covariance between $\hat\lambda_t = X_{t1}\hat\beta_T + \hat\sigma_T v(X_{t1}\hat\beta_T/\hat\sigma_T)$ and the residuals based on (3.22). In this case, the purging regression takes the form

$\hat\lambda_t$   on   $\exp[\hat\omega_T^2/2 + X_{t1}\hat\alpha_T]X_{t1}$,   t=1,...,T.   (3.23)

Note that a similar test could be based on competing specifications for $E(Y_t|X_t)$; that is, the zero as well as positive observations for $Y_t$ can be used. This would require specifying $P(Y_t>0|X_t)$ in the competing model (3.21), such as in Cragg [2].

Finally, many other indicators could be included in $\hat\lambda_t$, such as the gradient of the competing conditional mean function: $\hat\lambda_t = \exp[\hat\omega_T^2/2 + X_{t1}\hat\alpha_T]X_{t1}$ in the case of (3.21). I do not know the power properties of these tests. They are included here primarily to illustrate the scope of Theorem 2.1. ■
4. Conclusions

This paper has developed a general class of specification tests for dynamic multivariate models which impose under $H_0$ only the hypotheses being tested (e.g. correctness of the conditional mean and/or correctness of the conditional variance). It is hoped that the computational simplicity of the methods proposed here removes some of the barriers to using robust test statistics in practice.

The possibility of generating simple test statistics when $T^{1/2}(\hat\theta_T - \theta_o)$ has a complicated limiting distribution should be useful in several situations. The tobit example in Section 3 is only one case where the conditional mean parameters are estimated using a method other than the efficient WNLLS procedure. Another example is choosing between log-linear and linear-linear specifications. In this case, both models can be estimated by OLS, and then transformed in the manner of the tobit example to obtain estimates of $E(Y_t|X_t)$ for the separate models.

Theorem 2.1 can be extended to certain unit root time series models. The initial purging of $C_t^{1/2}\nabla_\theta\varphi_t$ from $C_t^{1/2}A_t$ in some cases results in indicators that are effectively stationary. This is the case for the LM test in linear time series models where the regressors excluded under the null hypothesis are individually cointegrated with the regressors included under the null. Statistics derived from Theorem 2.1 have the advantage over the usual Wald or LM tests of being robust to conditional heteroskedasticity under $H_0$. Extending Theorem 2.1 to general nonstationary time series models is left for future research.


Footnotes 

1. If $E[U_t(\theta_o)^2\nabla_\theta m_t(\theta_o)'\lambda_t(\delta_T^o)] = 0$ and $E[\nabla_\theta m_t(\theta_o)'\lambda_t(\delta_T^o)] = 0$ then the regression form in (2.10) is valid in the presence of heteroskedasticity. These orthogonality conditions occur only in limited cases. One example is testing for serial correlation in a static regression model ($E(Y_t|X_t)$ depends only on $Z_t$ under $H_0$) with static heteroskedasticity ($V(Y_t|X_t)$ depends only on $Z_t$ under $H_0$). If $E[\nabla_\theta m_t(\theta_o)'\lambda_t(\delta_T^o)] = 0$ under $H_0$ then a simple test which is robust in the presence of arbitrary heteroskedasticity is $TR^2$ from the regression

1   on   $\hat U_t\hat\lambda_t$,   t=1,...,T,

that is, $\hat U_t\nabla_\theta\hat m_t$ can be omitted from the auxiliary regression.

2. Hal White has suggested an interesting extension to Theorem 2.1. First, there is no need to split $\psi_t(Y_t,X_t,\theta)$ into $\eta_t(Y_t,X_t,\theta)$ and $\varphi_t(X_t,\theta)$. Then, instead of imposing (ii.b), use $\Phi_t(X_t,\hat\theta_T)$ in the purging regression, where $\Phi_t(X_t,\theta) = E_\theta[\nabla_\theta\psi_t(Y_t,X_t,\theta)|X_t]$. Note that it is now important to index the expectation operator by $\theta$. This expectation is the common expectation of the equivalence class $\mathcal{P}(\theta)$ of probability measures defined as follows: $P\in\mathcal{P}(\theta)$ if and only if

$$E_P[\psi_t(Y_t,X_t,\theta)|X_t] = 0$$

and

$$E_P[\nabla_\theta\psi_t(Y_t,X_t,\theta)|X_t] = \Phi_t(X_t,\theta),\qquad t=1,2,\dots$$

The need to compute $\Phi_t(X_t,\theta)$ generally imposes additional restrictions under the null hypothesis. However, this more general setup would allow robust tests in certain situations not covered by Theorem 2.1, such as tests for dynamic linear models estimated by two stage least squares.




References 


1. Breusch, T.S. and A.R. Pagan. A simple test for heteroskedasticity and random coefficient variation. Econometrica 47 (1979): 1287-1294.

2. Cragg, J.G. Some statistical models for limited dependent variables with applications to the demand for durable goods. Econometrica 39 (1971): 829-844.

3. Davidson, R. and J.G. MacKinnon. Several tests of model specification in the presence of alternative hypotheses. Econometrica 49 (1981): 781-793.

4. Domowitz, I. and H. White. Misspecified models with dependent observations. Journal of Econometrics 20 (1982): 35-58.

5. Engle, R.F. Autoregressive conditional heteroskedasticity with estimates of United Kingdom inflation. Econometrica 50 (1982): 987-1008.

6. Hansen, L.P. Large sample properties of generalized method of moments estimators. Econometrica 50 (1982): 1029-1054.

7. Heckman, J.J. The common structure of statistical models of truncation, sample selection and limited dependent variables and a simple estimator for such models. Annals of Economic and Social Measurement 5 (1976): 475-492.

8. Hausman, J.A. Specification tests in econometrics. Econometrica 46 (1978): 1251-1271.

9. Hsieh, D.A. A heteroskedasticity-consistent covariance matrix estimator for time series regressions. Journal of Econometrics 22 (1983): 281-290.

10. Newey, W.K. Maximum likelihood specification testing and conditional moment tests. Econometrica 53 (1985): 1047-1070.

11. Tauchen, G. Diagnostic testing and evaluation of maximum likelihood models. Journal of Econometrics 30 (1985): 415-443.

12. White, H. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48 (1980a): 817-838.

13. White, H. Nonlinear regression on cross section data. Econometrica 48 (1980b): 721-746.

14. White, H. Maximum likelihood estimation of misspecified models. Econometrica 50 (1982): 1-26.



15. White, H. Asymptotic Theory for Econometricians. New York: Academic Press, 1984.

16. White, H. Specification testing in dynamic models. Invited Paper at the Fifth World Congress of the Econometric Society, Cambridge, Massachusetts, August 1985.

17. White, H. and I. Domowitz. Nonlinear regression with dependent observations. Econometrica 52 (1984): 143-162.

18. Wooldridge, J.M. Asymptotic properties of econometric estimators. UCSD Ph.D. dissertation, 1986.

19. Wooldridge, J.M. Specification testing and quasi-maximum likelihood estimation. Mimeo, MIT, 1987.




Mathematical  Appendix 

For  convenience,  I  include  a  lemma  which  is  used  repeatedly  in 
the  proof  of  Theorem  2.1. 

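Lemma A.1: Assume that the sequence of random functions $\{Q_T(W_T,\theta): \theta\in\Theta,\ T=1,2,\dots\}$, where $Q_T(W_T,\cdot)$ is continuous on $\Theta$ and $\Theta$ is a compact subset of $\mathbb{R}^P$, and the sequence of nonrandom functions $\{\bar Q_T(\theta): \theta\in\Theta,\ T=1,2,\dots\}$, satisfy the following conditions:

(i) $\sup_{\theta\in\Theta}|Q_T(W_T,\theta) - \bar Q_T(\theta)| \overset{p}{\to} 0$;

(ii) $\{\bar Q_T(\theta): \theta\in\Theta,\ T=1,2,\dots\}$ is continuous on $\Theta$ uniformly in $T$.

Let $\hat\theta_T$ be a sequence of random vectors such that $\hat\theta_T - \theta_T^o \overset{p}{\to} 0$ where $\{\theta_T^o\}\subset\Theta$. Then

$$Q_T(W_T,\hat\theta_T) - \bar Q_T(\theta_T^o) \overset{p}{\to} 0.$$

Proof: see Wooldridge [18, Lemma A.1, p. 229]. ■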

A  definition  simplifies  the  statement  of  the  conditions. 

Definition A.1: A sequence of random functions $\{q_t(Y_t,X_t,\theta): \theta\in\Theta,\ t=1,2,\dots\}$, where $q_t(Y_t,X_t,\cdot)$ is continuous on $\Theta$ and $\Theta$ is a compact subset of $\mathbb{R}^P$, is said to satisfy the Uniform Weak Law of Large Numbers (UWLLN) and Uniform Continuity (UC) conditions provided that

(i) $\sup_{\theta\in\Theta}\left|T^{-1}\sum_{t=1}^T \{q_t(Y_t,X_t,\theta) - E[q_t(Y_t,X_t,\theta)]\}\right| \overset{p}{\to} 0$

and

(ii) $\{T^{-1}\sum_{t=1}^T E[q_t(Y_t,X_t,\theta)]: \theta\in\Theta,\ T=1,2,\dots\}$ is $O(1)$ and continuous on $\Theta$ uniformly in $T$. ■
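In the statement of the conditions, the dependence of functions on the variables $Y_t$ and $X_t$ is frequently suppressed for notational convenience. If $a(\theta)$ is a $1\times L$ function of the $P\times 1$ vector $\theta$ then, by convention, $\nabla_\theta a(\theta)$ is the $L\times P$ matrix $\nabla_\theta[a(\theta)']$. If $A(\theta)$ is a $Q\times L$ matrix then the matrix $\nabla_\theta A(\theta)$ is the $LQ\times P$ matrix defined as

$$\nabla_\theta A(\theta) \equiv [\nabla_\theta A_1(\theta)',\dots,\nabla_\theta A_Q(\theta)']',$$

where $A_j(\theta)$ is the $j$th row of $A(\theta)$ and $\nabla_\theta A_j(\theta)$ is the $L\times P$ gradient of $A_j(\theta)$ as defined above. For simplicity, for any $L\times 1$ vector function $\varphi$, define the second derivative of $\varphi$ to be the $LP\times P$ matrix

$$\nabla_\theta^2\varphi(\theta) \equiv \nabla_\theta[\nabla_\theta\varphi(\theta)].$$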

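Conditions A.1:

(i) $\Theta\subset\mathbb{R}^P$ and $\Pi\subset\mathbb{R}^N$ are compact and have nonempty interiors;

(ii) $\theta_o\in\mathrm{int}(\Theta)$, $\{\pi_T^o: T=1,2,\dots\}\subset\mathrm{int}(\Pi)$ uniformly in $T$;

(iii) (a) $\{\eta_t(y_t,x_t,\theta): \theta\in\Theta\}$ is a sequence of $L\times 1$ functions such that $\eta_t(\cdot,\theta)$ is Borel measurable for each $\theta\in\Theta$ and $\eta_t(y_t,x_t,\cdot)$ is continuously differentiable on the interior of $\Theta$ for all $y_t, x_t$, $t=1,2,\dots$;

(b) $\{\varphi_t(x_t,\theta): \theta\in\Theta\}$ is a sequence of $L\times 1$ functions such that $\varphi_t(\cdot,\theta)$ is Borel measurable for each $\theta\in\Theta$ and $\varphi_t(x_t,\cdot)$ is twice continuously differentiable on the interior of $\Theta$ for all $x_t$, $t=1,2,\dots$;

(c) $\{C_t(x_t,\delta): \delta\in\Delta\}$ is a sequence of $L\times L$ matrices satisfying the measurability requirements, $C_t(x_t,\delta)$ is symmetric and positive semi-definite for all $x_t$ and $\delta$, and $C_t(x_t,\cdot)$ is differentiable on $\mathrm{int}(\Delta)$ for all $x_t$, $t=1,2,\dots$;

(d) $\{A_t(x_t,\delta): \delta\in\Delta\}$ is a sequence of $L\times Q$ matrices satisfying the measurability requirements, and $A_t(x_t,\cdot)$ is differentiable on $\mathrm{int}(\Delta)$ for all $x_t$, $t=1,2,\dots$;

(iv) (a) $T^{1/2}(\hat\theta_T - \theta_o) = O_p(1)$;

(b) $T^{1/2}(\hat\pi_T - \pi_T^o) = O_p(1)$;

(v) (a) $\{\nabla_\theta\varphi_t(\theta)'C_t(\delta)\nabla_\theta\varphi_t(\theta)\}$ and $\{\nabla_\theta\varphi_t(\theta)'C_t(\delta)A_t(\delta)\}$ satisfy the UWLLN and UC conditions;

(b) $\{T^{-1}\sum_{t=1}^T E[\nabla_\theta\varphi_t^{o\prime}C_t^o\nabla_\theta\varphi_t^o]\}$ is uniformly positive definite;

(vi) (a) $\{\nabla_\theta\varphi_t(\theta)'C_t(\delta)\nabla_\theta\eta_t(\theta)\}$, $\{[I_P\otimes\psi_t(\theta)'C_t(\delta)]\nabla_\theta^2\varphi_t(\theta)\}$, and $\{\nabla_\theta\varphi_t(\theta)'[I_L\otimes\psi_t(\theta)']\nabla_\delta C_t(\delta)\}$ satisfy the UWLLN and UC conditions;

(b) $T^{-1/2}\sum_{t=1}^T \nabla_\theta\varphi_t^{o\prime}C_t^o\psi_t^o = O_p(1)$;

(vii) (a) $\{A_t(\delta)'C_t(\delta)\nabla_\theta\eta_t(\theta)\}$, $\{A_t(\delta)'C_t(\delta)\nabla_\theta\varphi_t(\theta)\}$, $\{[I_Q\otimes\psi_t(\theta)'C_t(\delta)]\nabla_\delta A_t(\delta)'\}$, $\{[I_P\otimes\psi_t(\theta)'C_t(\delta)]\nabla_\theta^2\varphi_t(\theta)\}$, $\{A_t(\delta)'[I_L\otimes\psi_t(\theta)']\nabla_\delta C_t(\delta)\}$, and $\{\nabla_\theta\varphi_t(\theta)'[I_L\otimes\psi_t(\theta)']\nabla_\delta C_t(\delta)\}$ satisfy the UWLLN and UC requirements;

(viii) (a) $\Xi_T^o \equiv T^{-1}\sum_{t=1}^T E[(A_t^o - \nabla_\theta\varphi_t^oB_T^o)'C_t^o\psi_t^o\psi_t^{o\prime}C_t^o(A_t^o - \nabla_\theta\varphi_t^oB_T^o)]$ is uniformly p.d.;

(b) $\Xi_T^{o-1/2}T^{-1/2}\sum_{t=1}^T (A_t^o - \nabla_\theta\varphi_t^oB_T^o)'C_t^o\psi_t^o \overset{d}{\to} N(0,I_Q)$;

(c) $\{A_t(\delta)'C_t(\delta)\psi_t(\theta)\psi_t(\theta)'C_t(\delta)A_t(\delta)\}$, $\{A_t(\delta)'C_t(\delta)\psi_t(\theta)\psi_t(\theta)'C_t(\delta)\nabla_\theta\varphi_t(\theta)\}$, and $\{\nabla_\theta\varphi_t(\theta)'C_t(\delta)\psi_t(\theta)\psi_t(\theta)'C_t(\delta)\nabla_\theta\varphi_t(\theta)\}$ satisfy the UWLLN and UC conditions. ■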
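Proof of Theorem 2.1: First, note that assumptions (i)-(vi) ensure existence of $\hat B_T$ and imply that $\hat B_T - B_T^o = o_p(1)$ by Lemma A.1. Therefore,

$$\hat\xi_T = T^{-1/2}\sum_{t=1}^T [\hat A_t - \nabla_\theta\hat\varphi_tB_T^o]'\hat C_t\hat\psi_t - (\hat B_T - B_T^o)'T^{-1/2}\sum_{t=1}^T \nabla_\theta\hat\varphi_t'\hat C_t\hat\psi_t.$$   (a.1)

Consider the term post-multiplying $(\hat B_T - B_T^o)'$. A standard mean value expansion about $\delta_T^o$, assumption (vi.a), and Lemma A.1 yield

$$T^{-1/2}\sum_{t=1}^T \nabla_\theta\hat\varphi_t'\hat C_t\hat\psi_t = T^{-1/2}\sum_{t=1}^T \nabla_\theta\varphi_t^{o\prime}C_t^o\psi_t^o$$
$$\qquad + T^{-1}\sum_{t=1}^T \{\nabla_\theta\varphi_t^{o\prime}C_t^o\nabla_\theta\psi_t^o + [I_P\otimes\psi_t^{o\prime}C_t^o]\nabla_\theta^2\varphi_t^o\}\,T^{1/2}(\hat\theta_T - \theta_o)$$
$$\qquad + T^{-1}\sum_{t=1}^T \nabla_\theta\varphi_t^{o\prime}[I_L\otimes\psi_t^{o\prime}]\nabla_\delta C_t^o\,T^{1/2}(\hat\delta_T - \delta_T^o) + o_p(1).$$   (a.2)

The first term on the right hand side of (a.2) is $O_p(1)$ by (vi.b). By (vi.a) and (iv.a,b), the terms in lines two and three of (a.2) are also $O_p(1)$. Therefore,

$$T^{-1/2}\sum_{t=1}^T \nabla_\theta\hat\varphi_t'\hat C_t\hat\psi_t = O_p(1).$$   (a.3)

Along with $\hat B_T - B_T^o = o_p(1)$, this establishes that under $H_0$,

$$\hat\xi_T = T^{-1/2}\sum_{t=1}^T [\hat A_t - \nabla_\theta\hat\varphi_tB_T^o]'\hat C_t\hat\psi_t + o_p(1).$$   (a.4)

A mean value expansion, assumption (vii), and Lemma A.1 yield

$$\hat\xi_T = T^{-1/2}\sum_{t=1}^T [A_t^o - \nabla_\theta\varphi_t^oB_T^o]'C_t^o\psi_t^o$$
$$\qquad + T^{-1}\sum_{t=1}^T [A_t^o - \nabla_\theta\varphi_t^oB_T^o]'C_t^o\nabla_\theta\psi_t^o\,T^{1/2}(\hat\theta_T - \theta_o)$$
$$\qquad - B_T^{o\prime}T^{-1}\sum_{t=1}^T [I_P\otimes\psi_t^{o\prime}C_t^o]\nabla_\theta^2\varphi_t^o\,T^{1/2}(\hat\theta_T - \theta_o)$$
$$\qquad + T^{-1}\sum_{t=1}^T \{[I_Q\otimes\psi_t^{o\prime}C_t^o]\nabla_\delta A_t^{o\prime} + [A_t^o - \nabla_\theta\varphi_t^oB_T^o]'[I_L\otimes\psi_t^{o\prime}]\nabla_\delta C_t^o\}\,T^{1/2}(\hat\delta_T - \delta_T^o) + o_p(1).$$   (a.5)

Consider the second line of (a.5). It must be shown that the average appearing there is $o_p(1)$ under $H_0$. First, note that

$$T^{-1}\sum_{t=1}^T [A_t^o - \nabla_\theta\varphi_t^oB_T^o]'C_t^o\nabla_\theta\psi_t^o = T^{-1}\sum_{t=1}^T [A_t^o - \nabla_\theta\varphi_t^oB_T^o]'C_t^o\nabla_\theta\eta_t^o - T^{-1}\sum_{t=1}^T [A_t^o - \nabla_\theta\varphi_t^oB_T^o]'C_t^o\nabla_\theta\varphi_t^o.$$   (a.6)

By (ii.b) of the text, $E[\nabla_\theta\eta_t^o|X_t] = 0$ under $H_0$. Note that $A_t^o$, $\nabla_\theta\varphi_t^o$, and $C_t^o$ depend only on $X_t$. Also, $B_T^o$ is defined such that

$$T^{-1}\sum_{t=1}^T E[(A_t^o - \nabla_\theta\varphi_t^oB_T^o)'C_t^o\nabla_\theta\varphi_t^o] = 0.$$   (a.7)

The regularity conditions imposed imply that each of the averages on the right hand side of (a.6) satisfies the WLLN. Therefore

$$T^{-1}\sum_{t=1}^T [A_t^o - \nabla_\theta\varphi_t^oB_T^o]'C_t^o\nabla_\theta\psi_t^o = o_p(1).$$   (a.8)

Because $E[\psi_t^o|X_t] = 0$ under $H_0$, it is even easier to show that the remaining sample averages in (a.5) are $o_p(1)$. Combined with $T^{1/2}(\hat\delta_T - \delta_T^o) = O_p(1)$ this establishes the first conclusion of the theorem:

$$\hat\xi_T = T^{-1/2}\sum_{t=1}^T [A_t^o - \nabla_\theta\varphi_t^oB_T^o]'C_t^o\psi_t^o + o_p(1).$$   (a.9)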

Given (viii.a), the asymptotic covariance matrix of $\hat\xi_T$ is uniformly positive definite. Moreover, $\Xi_T^{o-1/2}\hat\xi_T \overset{d}{\to} N(0,I_Q)$ under $H_0$ by (viii.b). Condition (viii.c) ensures that

$$\hat\Xi_T \equiv T^{-1}\sum_{t=1}^T [(\hat A_t - \nabla_\theta\hat\varphi_t\hat B_T)'\hat C_t\hat\psi_t\hat\psi_t'\hat C_t(\hat A_t - \nabla_\theta\hat\varphi_t\hat B_T)]$$   (a.10)

is a consistent estimator of $\Xi_T^o$. It is easy to see that

$$\hat\xi_T'\hat\Xi_T^{-1}\hat\xi_T = TR_u^2,$$   (a.11)

where $R_u^2$ is the uncentered r-squared from the regression

1   on   $\tilde\psi_t'\ddot A_t$,   t=1,...,T,   (a.12)

and $\tilde\psi_t$ and $\ddot A_t$ are as defined in the text.
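
The algebraic identity in (a.11) can be checked numerically. In the sketch below, the rows of `Z` stand in for the $1\times Q$ regressors of (a.12); the function name and the use of simulated data are assumptions made only for this illustration.

```python
import numpy as np

def quadratic_form_vs_tr2(Z):
    """Compare xi' Xi^{-1} xi with T * (uncentered R^2) from 1 on Z."""
    T = Z.shape[0]
    xi = Z.sum(axis=0) / np.sqrt(T)        # T^{-1/2} * sum of z_t'
    Xi = Z.T @ Z / T                       # T^{-1} * sum of z_t' z_t
    quad = float(xi @ np.linalg.solve(Xi, xi))
    # Regression of a vector of ones on Z.
    coef, *_ = np.linalg.lstsq(Z, np.ones(T), rcond=None)
    fitted = Z @ coef
    r2_uncentered = (fitted @ fitted) / T  # total sum of squares is T
    return quad, float(T * r2_uncentered)
```

With full column rank in `Z`, the two returned numbers should agree up to rounding error.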